Goto

Collaborating Authors

 law student


The Verification-Value Paradox: A Normative Critique of Gen AI in Legal Practice

Yuvaraj, Joshua

arXiv.org Artificial Intelligence

It is often claimed that machine learning-based generative AI products will drastically streamline and reduce the cost of legal practice. This enthusiasm assumes lawyers can effectively manage AI's risks. Cases in Australia and elsewhere in which lawyers have been reprimanded for submitting inaccurate AI-generated content to courts suggest this paradigm must be revisited. This paper argues that a new paradigm is needed to evaluate AI use in practice, given (a) AI's disconnection from reality and its lack of transparency, and (b) lawyers' paramount duties like honesty, integrity, and not to mislead the court. It presents an alternative model of AI use in practice that more holistically reflects these features (the verification-value paradox). That paradox suggests increases in efficiency from AI use in legal practice will be met by a correspondingly greater imperative to manually verify any outputs of that use, rendering the net value of AI use often negligible to lawyers. The paper then sets out the paradox's implications for legal practice and legal education, including for AI use but also the values that the paradox suggests should undergird legal practice: fidelity to the truth and civic responsibility.


LawFlow: Collecting and Simulating Lawyers' Thought Processes on Business Formation Case Studies

Das, Debarati, Le, Khanh Chi, Parkar, Ritik Sachin, De Langis, Karin, Madson, Brendan, Berryman, Chad M., Willis, Robin M., Moses, Daniel H., McDonnell, Brett, Schwarcz, Daniel, Kang, Dongyeop

arXiv.org Artificial Intelligence

Legal practitioners, particularly those early in their careers, face complex, high-stakes tasks that require adaptive, context-sensitive reasoning. While AI holds promise in supporting legal work, current datasets and models are narrowly focused on isolated subtasks and fail to capture the end-to-end decision-making required in real-world practice. To address this gap, we introduce LawFlow, a dataset of complete end-to-end legal workflows collected from trained law students, grounded in real-world business entity formation scenarios. Unlike prior datasets focused on input-output pairs or linear chains of thought, LawFlow captures dynamic, modular, and iterative reasoning processes that reflect the ambiguity, revision, and client-adaptive strategies of legal practice. Using LawFlow, we compare human and LLM-generated workflows, revealing systematic differences in structure, reasoning flexibility, and plan execution. Human workflows tend to be modular and adaptive, while LLM workflows are more sequential, exhaustive, and less sensitive to downstream implications. Our findings also suggest that legal professionals prefer AI to carry out supportive roles, such as brainstorming, identifying blind spots, and surfacing alternatives, rather than executing complex workflows end-to-end. Our results highlight both the current limitations of LLMs in supporting complex legal workflows and opportunities for developing more collaborative, reasoning-aware legal AI systems. All data and code are available on our project page (https://minnesotanlp.github.io/LawFlow-website/).


It Cannot Be Right If It Was Written by AI: On Lawyers' Preferences of Documents Perceived as Authored by an LLM vs a Human

Harasta, Jakub, Novotná, Tereza, Savelka, Jaromir

arXiv.org Artificial Intelligence

Large Language Models (LLMs) enable a future in which certain types of legal documents may be generated automatically. This has a great potential to streamline legal processes, lower the cost of legal services, and dramatically increase access to justice. While many researchers focus their efforts on proposing and evaluating LLM-based applications supporting tasks in the legal domain, there is a notable lack of investigations into how legal professionals perceive content if they believe it has been generated by an LLM. Yet, this is a critical point as over-reliance or unfounded skepticism may influence whether such documents bring about appropriate legal consequences. This study is the necessary analysis in the context of the ongoing transition towards mature generative AI systems. Specifically, we examined whether the perception of legal documents' by lawyers (n=75) varies based on their assumed origin (human-crafted vs AI-generated). The participants evaluated the documents focusing on their correctness and language quality. Our analysis revealed a clear preference for documents perceived as crafted by a human over those believed to be generated by AI. At the same time, most of the participants are expecting the future in which documents will be generated automatically. These findings could be leveraged by legal practitioners, policy makers and legislators to implement and adopt legal document generation technology responsibly, and to fuel the necessary discussions into how legal processes should be updated to reflect the recent technological developments.


Calpric: Inclusive and Fine-grain Labeling of Privacy Policies with Crowdsourcing and Active Learning

Qiu, Wenjun, Lie, David, Austin, Lisa

arXiv.org Artificial Intelligence

A significant challenge to training accurate deep learning models on privacy policies is the cost and difficulty of obtaining a large and comprehensive set of training data. To address these challenges, we present Calpric , which combines automatic text selection and segmentation, active learning and the use of crowdsourced annotators to generate a large, balanced training set for privacy policies at low cost. Automated text selection and segmentation simplifies the labeling task, enabling untrained annotators from crowdsourcing platforms, like Amazon's Mechanical Turk, to be competitive with trained annotators, such as law students, and also reduces inter-annotator agreement, which decreases labeling cost. Having reliable labels for training enables the use of active learning, which uses fewer training samples to efficiently cover the input space, further reducing cost and improving class and data category balance in the data set. The combination of these techniques allows Calpric to produce models that are accurate over a wider range of data categories, and provide more detailed, fine-grain labels than previous work. Our crowdsourcing process enables Calpric to attain reliable labeled data at a cost of roughly $0.92-$1.71 per labeled text segment. Calpric 's training process also generates a labeled data set of 16K privacy policy text segments across 9 Data categories with balanced positive and negative samples.


Can ChatGPT Perform Reasoning Using the IRAC Method in Analyzing Legal Scenarios Like a Lawyer?

Kang, Xiaoxi, Qu, Lizhen, Soon, Lay-Ki, Trakic, Adnan, Zhuo, Terry Yue, Emerton, Patrick Charles, Grant, Genevieve

arXiv.org Artificial Intelligence

Large Language Models (LLMs), such as ChatGPT, have drawn a lot of attentions recently in the legal domain due to its emergent ability to tackle a variety of legal tasks. However, it is still unknown if LLMs are able to analyze a legal case and perform reasoning in the same manner as lawyers. Therefore, we constructed a novel corpus consisting of scenarios pertain to Contract Acts Malaysia and Australian Social Act for Dependent Child. ChatGPT is applied to perform analysis on the corpus using the IRAC method, which is a framework widely used by legal professionals for organizing legal analysis. Each scenario in the corpus is annotated with a complete IRAC analysis in a semi-structured format so that both machines and legal professionals are able to interpret and understand the annotations. In addition, we conducted the first empirical assessment of ChatGPT for IRAC analysis in order to understand how well it aligns with the analysis of legal professionals. Our experimental results shed lights on possible future research directions to improve alignments between LLMs and legal experts in terms of legal reasoning.


Improving Access to Justice for the Indian Population: A Benchmark for Evaluating Translation of Legal Text to Indian Languages

Mahapatra, Sayan, Datta, Debtanu, Soni, Shubham, Goswami, Adrijit, Ghosh, Saptarshi

arXiv.org Artificial Intelligence

Most legal text in the Indian judiciary is written in complex English due to historical reasons. However, only about 10% of the Indian population is comfortable in reading English. Hence legal text needs to be made available in various Indian languages, possibly by translating the available legal text from English. Though there has been a lot of research on translation to and between Indian languages, to our knowledge, there has not been much prior work on such translation in the legal domain. In this work, we construct the first high-quality legal parallel corpus containing aligned text units in English and nine Indian languages, that includes several low-resource languages. We also benchmark the performance of a wide variety of Machine Translation (MT) systems over this corpus, including commercial MT systems, open-source MT systems and Large Language Models. Through a comprehensive survey by Law practitioners, we check how satisfied they are with the translations by some of these MT systems, and how well automatic MT evaluation metrics agree with the opinions of Law practitioners.


How AI could keep law students in debt forever

FOX News

Attorney Bryan Rotella said the growing use of AI in legal services will increase efficiency but could threaten the jobs of legal assistants and young lawyers. The rise of artificial intelligence could create a ripple effect across the legal industry, putting law school students out of entry-level jobs before even entering the workforce and stripping them of necessary experience to become good lawyers, an attorney of over 20 years said. "What concerns me is that you're going to have a whole bunch of people coming out of law school with huge loans, which we already know is a crisis, and they're going to be outsourced by this artificial intelligence," Bryan Rotella, attorney and founder of GenCo Legal, told Fox News. "I don't know that anyone's warning them of that." As AI is increasingly incorporated into industries like health care, financial services and the legal field, Rotella said there are many ways this technology can be used to aid professionals.


Can a chatbot earn a JD? This one averaged C-plus on law school exams

#artificialintelligence

An artificial intelligence tool called ChatGPT averaged a C-plus on exams at the University of Minnesota Law School, according to four law professors who gave it a try. The law professors used ChatGPT to answer the questions and then blindly graded the answers, along with answers by real students, report Reuters and Insider. The average C-plus grade was still below that of law students, who had a B-plus average. And ChatGPT's performance, while earning passing grades, was at or near the bottom of the class. The professors' findings are available here.


Toward an Intelligent Tutoring System for Argument Mining in Legal Texts

Westermann, Hannes, Savelka, Jaromir, Walker, Vern R., Ashley, Kevin D., Benyekhlef, Karim

arXiv.org Artificial Intelligence

We propose an adaptive environment (CABINET) to support caselaw analysis (identifying key argument elements) based on a novel cognitive computing framework that carefully matches various machine learning (ML) capabilities to the proficiency of a user. CABINET supports law students in their learning as well as professionals in their work. The results of our experiments focused on the feasibility of the proposed framework are promising. We show that the system is capable of identifying a potential error in the analysis with very low false positives rate (2.0-3.5%), as well as of predicting the key argument element type (e.g., an issue or a holding) with a reasonably high F1-score (0.74).


Corpus for Automatic Structuring of Legal Documents

Kalamkar, Prathamesh, Tiwari, Aman, Agarwal, Astha, Karn, Saurabh, Gupta, Smita, Raghavan, Vivek, Modi, Ashutosh

arXiv.org Artificial Intelligence

In populous countries, pending legal cases have been growing exponentially. There is a need for developing techniques for processing and organizing legal documents. In this paper, we introduce a new corpus for structuring legal documents. In particular, we introduce a corpus of legal judgment documents in English that are segmented into topical and coherent parts. Each of these parts is annotated with a label coming from a list of pre-defined Rhetorical Roles. We develop baseline models for automatically predicting rhetorical roles in a legal document based on the annotated corpus. Further, we show the application of rhetorical roles to improve performance on the tasks of summarization and legal judgment prediction. We release the corpus and baseline model code along with the paper.